{ "cells": [ { "cell_type": "markdown", "id": "3a9ef243", "metadata": {}, "source": [ "# Latin Hypercube Sampling in MUSICA\n", "In the previous example, all of the grid cells had to be filled in manually with data.\n", "This makes it not practically useful on its own since it would be tedious to make a system with many grid cells.\n", "One solution to this is Latin Hypercube Sampling (LHS), a statistical method for generating multidimensional random samples.\n", "It avoids the problem of clustering that can sometimes appear in pure random sampling.\n", "This tutorial will go over a simple example of utilizing LHS to run a multi-grid-cell solver in MUSICA.\n", "For documentation on Latin Hypercube Sampling, go [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.qmc.LatinHypercube.html)." ] }, { "cell_type": "markdown", "id": "d50c8f23", "metadata": {}, "source": [ "## 1. Importing Libraries\n", "Below is a list of the required libraries for this tutorial:" ] }, { "cell_type": "code", "execution_count": 1, "id": "5c595ccd", "metadata": {}, "outputs": [], "source": [ "import musica\n", "import musica.mechanism_configuration as mc\n", "import matplotlib.pyplot as plt\n", "from scipy.stats import qmc\n", "import pandas as pd\n", "import numpy as np\n", "import seaborn as sns\n", "pd.set_option('display.float_format', str) # This is done to make the arrays more readable\n", "np.set_printoptions(suppress=True) # This is done to make the arrays more readable" ] }, { "cell_type": "markdown", "id": "7f80a7cf", "metadata": {}, "source": [ "## 2. Rerunning Previous Code\n", "This is simply a copy of the first 4 steps from the [Multiple Grid Cells Tutorial](1.%20multiple_grid_cells.ipynb) since the code is identical aside from the number of grid cells being increased.\n", "If you would like to view the explanation for this code, refer to the hyperlink on the above line." ] }, { "cell_type": "code", "execution_count": 2, "id": "8724138e", "metadata": {}, "outputs": [], "source": [ "A = mc.Species(name=\"A\")\n", "B = mc.Species(name=\"B\")\n", "C = mc.Species(name=\"C\")\n", "species = [A, B, C]\n", "gas = mc.Phase(name=\"gas\", species=species)\n", "\n", "r1 = mc.Arrhenius(\n", " name=\"A_to_B\",\n", " A=4.0e-3, # Pre-exponential factor\n", " C=50, # Activation energy (units assumed to be K)\n", " reactants=[A],\n", " products=[B],\n", " gas_phase=gas\n", ")\n", "\n", "r2 = mc.Arrhenius(\n", " name=\"B_to_C\",\n", " A=4.0e-3,\n", " C=50, \n", " reactants=[B],\n", " products=[C],\n", " gas_phase=gas\n", ")\n", "\n", "mechanism = mc.Mechanism(\n", " name=\"musica_micm_example\",\n", " species=species,\n", " phases=[gas],\n", " reactions=[r1, r2]\n", ")\n", "\n", "solver = musica.MICM(mechanism = mechanism, solver_type = musica.SolverType.rosenbrock_standard_order)\n", "\n", "num_grid_cells = 100\n", "state = solver.create_state(num_grid_cells)" ] }, { "cell_type": "markdown", "id": "99119268", "metadata": {}, "source": [ "## 3. Creating and Scaling a Latin Hypercube Sampler\n", "This Latin Hypercube Sampler uses the same 5 dimensions as the [Multiple Grid Cells Tutorial](1.%20multiple_grid_cells.ipynb) to randomize each of the individual systems, being:\n", "* temperature (Kelvin),\n", "* pressure (Pascals), and\n", "* the concentrations of each of the species (mol/m3).\n", "\n", "Next, the LHS is created with the provided number of dimensions with a randomized sample that will be scaled by the sampler.\n", "The upper and lower bounds for each of the five dimensions are then set, and the sample is scaled with those bounds by the LHS.\n", "Do note as before that the ordering inside the bounding arrays matters and cannot be changed." ] }, { "cell_type": "code", "execution_count": 3, "id": "773d1802", "metadata": {}, "outputs": [], "source": [ "ndim = 5\n", "nsamples = num_grid_cells\n", "\n", "# Create a Latin Hypercube sampler in the unit hypercube\n", "sampler = qmc.LatinHypercube(d=ndim)\n", "\n", "# Generate samples\n", "sample = sampler.random(n=nsamples)\n", "\n", "# Define bounds for each dimension\n", "l_bounds = [275, 100753.3, 0, 0, 0] # Lower bounds\n", "u_bounds = [325, 101753.3, 10, 10, 10] # Upper bounds\n", "\n", "# Scale the samples to the defined bounds\n", "sample_scaled = qmc.scale(sample, l_bounds, u_bounds)" ] }, { "cell_type": "markdown", "id": "82364d1b", "metadata": {}, "source": [ "## 4. Splitting up the Array Output and Running the Solver\n", "This code follows the same flow as the [Multiple Grid Cells Tutorial](1.%20multiple_grid_cells.ipynb) but with the new LHS values.\n", "The only notable change is the old box_model_values array becoming the sample_scaled output array from the LHS." ] }, { "cell_type": "code", "execution_count": 4, "id": "4ae18af9", "metadata": {}, "outputs": [], "source": [ "temperatures = sample_scaled[:, 0]\n", "pressures = sample_scaled[:, 1]\n", "concentrations = {\n", " \"A\": [],\n", " \"B\": [],\n", " \"C\": []\n", "}\n", "concentrations[\"A\"] = sample_scaled[:, 2]\n", "concentrations[\"B\"] = sample_scaled[:, 3]\n", "concentrations[\"C\"] = sample_scaled[:, 4]\n", "\n", "state.set_conditions(temperatures, pressures)\n", "state.set_concentrations(concentrations)\n", "concentrations_solved = []\n", "time_step_length = 1\n", "sim_length = 60\n", "curr_time = 0\n", "\n", "while curr_time <= sim_length:\n", " solver.solve(state, curr_time)\n", " concentrations_solved.append(state.get_concentrations())\n", " curr_time += time_step_length" ] }, { "cell_type": "markdown", "id": "cd065d8a", "metadata": {}, "source": [ "## 5. Expanding out the DataFrame (Advanced; Optional Read)\n", "The intention of this code snippet is to split up each grid cell for each time step onto a separate row in the DataFrame so they can be averaged when plotted.\n", "This will produce a confidence interval for each time step since there will be a set of unique values at every time step for each species.\n", "As an example, if there are 2 grid cells and 5 time steps, the data table will have 10 rows, with the first being the first grid cell for the first time step, the second row being the second grid cell for the first time step, the third row being the first grid cell for the second time step, and so on.\n", "You can see this pattern in the displayed DataFrame as well.\n", "After the expansion of the DataFrame, adding the other columns is fairly similar; they are simply given more rows since the number of rows is now the product of the number of grid cells and the number of time steps.\n", "The time column is notably different, however, since each time step has to be repeated for every grid cell.\n", "Due to the complexity of this code cell, it has been bundled into a function which is then called.\n", "Once that is all done, the expanded DataFrame is displayed.\n", "Note that the first column is the index of the DataFrame, not the time step here." ] }, { "cell_type": "code", "execution_count": 5, "id": "92625aa8", "metadata": {}, "outputs": [], "source": [ "def convert_results_all_cells():\n", " concentrations_solved_pd = []\n", " time = []\n", " for i in range(0, sim_length + 1, time_step_length):\n", " for j in range(0, num_grid_cells):\n", " concentrations_solved_pd.append({species: concentration[j] for species, concentration in concentrations_solved[int(i/time_step_length)].items()})\n", " time.append(i)\n", " df = pd.DataFrame(concentrations_solved_pd)\n", " df = df.rename(columns = {'A' : 'CONC.A.mol m-3', 'B' : 'CONC.B.mol m-3', 'C' : 'CONC.C.mol m-3'})\n", " df['time.s'] = time\n", " df['ENV.temperature.K'] = np.repeat(temperatures[0], (sim_length/time_step_length + 1.0) * num_grid_cells)\n", " df['ENV.pressure.Pa'] = np.repeat(pressures[0], (sim_length/time_step_length + 1.0) * num_grid_cells)\n", " df['ENV.air number density.mol m-3'] = np.repeat(state.get_conditions()['air_density'][0], (sim_length/time_step_length + 1.0) * num_grid_cells)\n", " df = df[['time.s', 'ENV.temperature.K', 'ENV.pressure.Pa', 'ENV.air number density.mol m-3', 'CONC.A.mol m-3', 'CONC.B.mol m-3', 'CONC.C.mol m-3']]\n", " return concentrations_solved_pd, df" ] }, { "cell_type": "code", "execution_count": 6, "id": "a338e7f6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| \n", " | time.s | \n", "ENV.temperature.K | \n", "ENV.pressure.Pa | \n", "ENV.air number density.mol m-3 | \n", "CONC.A.mol m-3 | \n", "CONC.B.mol m-3 | \n", "CONC.C.mol m-3 | \n", "
|---|---|---|---|---|---|---|---|
| 0 | \n", "0 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "9.603213213571069 | \n", "9.101503007047965 | \n", "1.3791552759048176 | \n", "
| 1 | \n", "0 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "6.316518664406818 | \n", "3.5442239182495494 | \n", "7.5309574382359585 | \n", "
| 2 | \n", "0 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "7.60797163127235 | \n", "4.98172822290733 | \n", "8.77528825739052 | \n", "
| 3 | \n", "0 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "2.15189275842202 | \n", "4.182414864738023 | \n", "8.078019549492074 | \n", "
| 4 | \n", "0 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "9.209637754005923 | \n", "5.905763265736759 | \n", "3.133282102633176 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 6095 | \n", "60 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "0.0006042084606455776 | \n", "0.005697973414917429 | \n", "9.121982643142896 | \n", "
| 6096 | \n", "60 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "0.0017057891805664078 | \n", "0.01528585497949259 | \n", "17.78465003247812 | \n", "
| 6097 | \n", "60 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "0.0006216717116591639 | \n", "0.00551919020208598 | \n", "13.319388770527299 | \n", "
| 6098 | \n", "60 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "0.0007354727081189604 | \n", "0.007513964364701525 | \n", "15.431405934122955 | \n", "
| 6099 | \n", "60 | \n", "286.00321749782404 | \n", "101281.96283938194 | \n", "42.59189914230601 | \n", "0.00011133657751124476 | \n", "0.0026815740270777354 | \n", "12.495870560349145 | \n", "
6100 rows × 7 columns
\n", "